Proof Engineering Challenges for Large-Scale Verification
Abstract
In this extended abstract I summarise challenges for proof engineering that we encountered in the formal verification of the seL4 microkernel [7] and in its subsequent proofs of integrity [12], non-interference [10], and binary correctness [11]. I focus on problems where there is scope for automation using AI and machine-learning techniques. For more background on the seL4 verification, and an analysis of the effort spent on it, see previous work [6].
The seL4 kernel is a third-generation microkernel in the L4 family [9]. Such kernels provide basic operating system (OS) mechanisms such as virtual memory, synchronous and asynchronous messages, interrupt handling, and, in the case of seL4, capability-based access control. The idea is that, using these mechanisms, one can isolate software components from each other in time and space. This enables separate, compositional verification of trusted components, as well as a proof that no such correctness is required of untrusted components, because the kernel and its policy configuration already sufficiently constrain their behaviour [2].
The verification of seL4 was not a large project by industrial software development standards, but it was sizeable for an academic formal verification project. The functional correctness proof of seL4 took roughly 12 person years. The overall initial project, including tool building, libraries, and research into scalable proof techniques, a usable semantics of the C programming language, etc., took about 25 person years; for a more precise analysis see [6]. This effort later paid off in the proofs of high-level security properties: they were much easier to show, because they could now be established on an abstract specification instead of directly on the code. Integrity cost less than 8 person months, non-interference less than 21 person months, and updates to the kernel to add a separation scheduler cost another 21 person months, including updates to all existing proofs. Automatic binary verification for functional correctness then extended these properties down to the low-level semantics of ARMv6 machine instructions.
The largest of these proofs, the initial functional correctness verification, produced about 200,000 lines of Isabelle/HOL proof scripts [7] with a team of on average 12 people over 4 years (about 7 full-time equivalents). During the subsequent proofs, the seL4 kernel evolved. While there were no C-level defects to fix in the verified code base, changes included performance improvements, API simplifications, additional features, and occasional fixes to parts of the non-verified code base of seL4, such as the initialisation and assembly
Related resources
Challenges and Experiences in Managing Large-Scale Proofs
Large-scale verification projects pose particular challenges. Issues include proof exploration, efficiency of the edit-check cycle, and proof refactoring for documentation and maintainability. We draw on insights from two large-scale verification projects, L4.verified and Verisoft, that both used the Isabelle/HOL prover. We identify the main challenges in large-scale proofs, propose possible so...
A Review Approach to Detecting Violations of Consistency between Specification and Program Structures
The application of specification-based program verification techniques (e.g., black-box testing, formal proof) faces strong challenges in practice when the gap between the structure of a specification and that of its program is large. This paper describes a view-based program review approach to addressing these challenges. The essential idea of the approach is first to derive comparable views fr...
Proof Engineering Considered Essential
In this talk, I will give an overview of the various formal verification projects around the evolving seL4 microkernel, and discuss our experience in large-scale proof engineering and maintenance. In particular, the presentation will draw a picture of what these verifications mean and how they fit together into a whole. Among these are a number of firsts: the first code-level functional correct...
Challenges in Aligning Requirements Engineering and Verification in a Large-Scale Industrial Context
[Context and motivation] When developing software, coordination between different organizational units is essential in order to develop a good quality product, on time and within budget. Particularly, the synchronization between requirements and verification processes is crucial in order to assure that the developed software product satisfies customer requirements. [Question/problem] Our resear...
A Regression Proof Selection Tool For Coq
Large-scale software verification projects increasingly rely on proof assistants, such as Coq, to construct formal proofs of program correctness. However, such proofs must be checked after every change to a project to ensure expected program behavior. This process of regression proving can require substantial machine time, which is detrimental to productivity and trust in evolving projects. We ...